Tuesday, December 14 • 11:15 - 12:05
Lessons Learned From the Utilization of Machine Learning Pipelines in Production Environment - Kyosuke Hashimoto & Masahiro Ito, Hitachi, Ltd.

Machine learning (ML) is increasingly applied in business. In ML projects, data scientists train models and write logic in Jupyter notebooks in an experimental environment, and that logic is typically intended to be executed only once. Engineers then struggle to turn it into reusable, composable, and scalable code for the production environment, and this work can become the bottleneck in delivering ML models. To produce maintainable code with less effort, we converted logic written for an experimental environment into production pipelines using Kedro, an OSS framework for designing ML pipelines. With Kedro, we could decompose complex logic into a set of reusable modules called "nodes" by clarifying the dependencies and relations between Jupyter cells. We could also compose new pipelines by replacing nodes as the system specification changed, and scale up by executing independent nodes in parallel. We found that the resulting Kedro pipelines made it easier to replace and scale logic. On the other hand, we found that manually converting Jupyter notebooks into pipelines is still a heavy task for data scientists. We will discuss the problems of manual conversion and possible solutions to them.
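
The three ideas in the abstract (nodes with explicit dependencies, node replacement, and parallel execution of independent nodes) can be illustrated with a minimal standard-library sketch. This is not Kedro's API and not the speakers' code: the `Node` class, `run_pipeline` function, and the example functions are all hypothetical names chosen for illustration, and a thread pool stands in for whatever parallel runner a real framework provides.

```python
import statistics
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for one former notebook cell:
    a function plus the names of the data it reads and writes."""
    func: Callable
    inputs: Tuple[str, ...]
    output: str


def run_pipeline(nodes: List[Node], data: Dict[str, object]) -> Dict[str, object]:
    """Run nodes in dependency order. Nodes whose inputs are all
    available are submitted to the thread pool together."""
    data = dict(data)  # do not mutate the caller's dictionary
    pending = list(nodes)
    with ThreadPoolExecutor() as pool:
        while pending:
            ready = [n for n in pending if all(i in data for i in n.inputs)]
            if not ready:
                raise ValueError("unresolvable inputs: check node dependencies")
            futures = {
                n.output: pool.submit(n.func, *(data[i] for i in n.inputs))
                for n in ready
            }
            for name, future in futures.items():
                data[name] = future.result()
            pending = [n for n in pending if n not in ready]
    return data


# Example "cells", each turned into a plain function.
def clean(raw):
    return [x for x in raw if x is not None]


def mean(values):
    return sum(values) / len(values)


def report(center, top):
    return f"center={center:.1f}, max={top}"


nodes = [
    Node(clean, ("raw",), "cleaned"),
    Node(mean, ("cleaned",), "center"),  # independent of the next node,
    Node(max, ("cleaned",), "top"),      # so both are submitted together
    Node(report, ("center", "top"), "summary"),
]
result = run_pipeline(nodes, {"raw": [1, None, 2, 6]})

# A specification change: swap the mean node for a median node,
# leaving every other node untouched.
nodes_v2 = [
    Node(statistics.median, ("cleaned",), "center") if n.output == "center" else n
    for n in nodes
]
result_v2 = run_pipeline(nodes_v2, {"raw": [1, None, 2, 6]})

print(result["summary"])     # center=3.0, max=6
print(result_v2["summary"])  # center=2.0, max=6
```

Because each node declares its inputs and output by name, the runner can derive the execution order on its own, and replacing one node does not require touching the others.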

Speakers

Kyosuke Hashimoto

Researcher, Hitachi, Ltd.
Kyosuke Hashimoto is a researcher at the Lumada Data Science Laboratory at Hitachi. He has 7 years of experience in cloud computing, including virtual networks and enterprise system management. Currently, he is focusing on the study of development and management of machine learning sy...

Masahiro Ito

Engineer, Hitachi, Ltd.
Masahiro Ito has been working on the development of big data and AI solutions with Apache Hadoop and its related open-source software. He is currently focusing on offering and co-creating MLOps solutions for customers who are going to build enterprise systems. So far, he has written the...



Tuesday December 14, 2021 11:15 - 12:05 JST
AI + Data Theater